Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
Latent structure models are a powerful tool for modeling language data: they
can mitigate the error propagation and annotation bottleneck in pipeline
systems, while simultaneously uncovering linguistic insights about the data.
One challenge with end-to-end training of these models is the argmax
operation, which has a null gradient. In this paper, we focus on surrogate
gradients, a popular strategy for dealing with this problem. We explore
latent structure learning through the angle of pulling back the downstream
learning objective. In this paradigm, we discover a principled motivation
both for the straight-through estimator (STE) and for the recently proposed
SPIGOT, a variant of STE for structured models. Our perspective leads to new
algorithms
in the same family. We empirically compare the known and the novel pulled-back
estimators against the popular alternatives, yielding new insight for
practitioners and revealing intriguing failure cases.
Comment: EMNLP 202
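The abstract above centers on the null gradient of argmax and the straight-through estimator. A minimal sketch of the STE idea, assuming a downstream loss that is linear in the one-hot argmax output (the function names here are illustrative, not from the paper):

```python
def onehot_argmax(scores):
    """Forward pass: hard argmax, returned as a one-hot vector."""
    i = max(range(len(scores)), key=lambda j: scores[j])
    return [1.0 if j == i else 0.0 for j in range(len(scores))]

def ste_grad(grad_wrt_output):
    """Backward pass: argmax is piecewise constant, so its true gradient is
    null almost everywhere. STE pretends the forward map was the identity
    and passes the downstream gradient straight through to the scores."""
    return list(grad_wrt_output)

scores = [0.2, 1.5, -0.3]
z = onehot_argmax(scores)   # forward: hard one-hot choice
w = [0.5, -1.0, 2.0]        # toy downstream loss L = w . z, so dL/dz = w
grad_scores = ste_grad(w)   # STE surrogate gradient dL/d(scores)
```

The scores then get updated as if the discrete choice had been a differentiable identity map, which is exactly the approximation the paper analyzes.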
Fact Checking in Community Forums
Community Question Answering (cQA) forums are very popular nowadays, as they
represent effective means for communities around particular topics to share
information. Unfortunately, this information is not always factual. Thus, here
we explore a new dimension in the context of cQA, which has been ignored so
far: checking the veracity of answers to particular questions in cQA forums. As
this is a new problem, we create a specialized dataset for it. We further
propose a novel multi-faceted model, which captures information from the answer
content (what is said and how), from the author profile (who says it), from the
rest of the community forum (where it is said), and from external authoritative
sources of information (external support). Evaluation results show a MAP value
of 86.54, which is 21 points absolute above the baseline.
Comment: AAAI-2018; Fact-Checking; Veracity; Community Question Answering;
Neural Networks; Distributed Representation
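The evaluation above is reported as MAP (Mean Average Precision), a standard ranking metric. A small self-contained sketch of how MAP is computed (function names are illustrative):

```python
def average_precision(ranked_relevance):
    """ranked_relevance: 1/0 relevance labels in the system's ranking order."""
    hits = 0
    precisions = []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)  # precision at each relevant hit
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(per_query_rankings):
    """MAP: the mean of per-query average precision scores."""
    aps = [average_precision(r) for r in per_query_rankings]
    return sum(aps) / len(aps)
```

For example, a query whose ranked results are relevant, irrelevant, relevant gets AP = (1/1 + 2/3)/2 ≈ 0.833, and MAP averages such scores over all queries.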
Modeling Structure with Undirected Neural Networks
Neural networks are powerful function estimators, leading to their status as
a paradigm of choice for modeling structured data. However, unlike other
structured representations that emphasize the modularity of the problem --
e.g., factor graphs -- neural networks are usually monolithic mappings from
inputs to outputs, with a fixed computation order. This limitation prevents
them from capturing different directions of computation and interaction between
the modeled variables.
In this paper, we combine the representational strengths of factor graphs and
of neural networks, proposing undirected neural networks (UNNs): a flexible
framework for specifying computations that can be performed in any order. For
particular choices, our proposed models subsume and extend many existing
architectures: feed-forward, recurrent, self-attention networks, auto-encoders,
and networks with implicit layers. We demonstrate the effectiveness of
undirected neural architectures, both unstructured and structured, on a range
of tasks: tree-constrained dependency parsing, convolutional image
classification, and sequence completion with attention. By varying the
computation order, we show how a single UNN can be used both as a classifier
and a prototype generator, and how it can fill in missing parts of an input
sequence, making UNNs a promising direction for further research.
Comment: ICML 202
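The key idea above, that one undirected model supports multiple computation orders, can be illustrated with a deliberately tiny toy (this is not the paper's formulation): a single scalar factor couples two variables symmetrically, and "running" the model in either direction means minimizing the same energy over a different variable.

```python
W = 2.0  # illustrative coupling weight

def energy(x, y, w=W):
    """Shared undirected factor: E(x, y) = -w*x*y + x^2/2 + y^2/2."""
    return -w * x * y + 0.5 * x * x + 0.5 * y * y

def infer_y(x, w=W):
    """One computation order: solve argmin_y E(x, y).
    dE/dy = -w*x + y = 0  =>  y = w*x."""
    return w * x

def infer_x(y, w=W):
    """The opposite order reuses the same factor: argmin_x E(x, y) = w*y."""
    return w * y
```

Both directions are derived from one energy function rather than two separate feed-forward mappings, which is the modularity that factor graphs contribute in the abstract above.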
Automatic Fact-Checking Using Context and Discourse Information
We study the problem of automatic fact-checking, paying special attention to
the impact of contextual and discourse information. We address two related
tasks: (i) detecting check-worthy claims and (ii) fact-checking claims. We
develop supervised systems based on neural networks, kernel-based support
vector machines, and combinations thereof, which make use of rich input
representations in terms of discourse cues and contextual features. For the
check-worthiness estimation task, we focus on political debates, and we model
the target claim in the context of the full intervention of a participant and
the previous and following turns in the debate, taking into account
contextual meta-information. For the fact-checking task, we focus on answer
verification in a community forum, and we model the veracity of the answer
with respect to the entire question-answer thread in which it occurs, as well
as with respect to other related posts from the entire forum. We develop
annotated datasets for both tasks and run an extensive experimental
evaluation, confirming that both types of information, but especially the
contextual features, play an important role.
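The contextual modeling described above, representing a claim jointly with the surrounding debate turns, can be sketched as follows. This is a hypothetical stub with a bag-of-words feature extractor, not the paper's actual representation; all names here are illustrative.

```python
def bow_features(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary (stub extractor)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def context_features(prev_turn, claim, next_turn, vocab):
    """Concatenate feature groups for the previous turn, the target claim,
    and the following turn into one input vector, so a downstream scorer
    sees the claim in its debate context."""
    return (bow_features(prev_turn, vocab)
            + bow_features(claim, vocab)
            + bow_features(next_turn, vocab))

def check_worthiness_score(weights, feats):
    """A linear scorer over the combined features (toy stand-in for the
    neural/SVM systems in the abstract)."""
    return sum(w * f for w, f in zip(weights, feats))
```

The point of the concatenation is that the scorer can weight the same word differently depending on whether it appears in the claim itself or in an adjacent turn.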